
Conversation

@astroshim
Contributor

What is this PR for?

This PR adds documentation for running Zeppelin in production environments, especially Spark on YARN.
The related issue is #1227, and I got a lot of hints from https://github.com/sequenceiq/hadoop-docker.
Tested on Ubuntu.

What type of PR is it?

Documentation

What is the Jira issue?

https://issues.apache.org/jira/browse/ZEPPELIN-1280

Questions:

  • Do the license files need an update? No.
  • Are there breaking changes for older versions? No.
  • Does this need documentation? No.



```
ps -ef
```
Contributor

You mean `ps -ef | grep spark`?

Contributor Author
@astroshim Aug 11, 2016

Hadoop is also running, so isn't plain `ps -ef` the best way?

Contributor
@AhyoungRyu Aug 11, 2016

Ah, right. But I just wanted to filter the process list.
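A possible compromise, as a sketch (assuming the doc only needs to show the Spark and Hadoop daemons rather than the full process table):

```
# Show only Spark- and Hadoop-related processes instead of the whole table
ps -ef | grep -E 'spark|hadoop'
```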

@AhyoungRyu
Contributor

AhyoungRyu commented Aug 11, 2016

@astroshim Great work indeed! While proofreading spark_cluster_mode.md, I updated a few minor things here. Could you check this one, please?

@astroshim
Contributor Author

astroshim commented Aug 11, 2016

@AhyoungRyu Thank you very much for your effort. 👍

Minor update for spark_cluster_mode.md
```
<li class="title"><span><b>Advanced</b><span></li>
<li><a href="{{BASE_PATH}}/install/virtual_machine.html">Zeppelin on Vagrant VM</a></li>
<li><a href="{{BASE_PATH}}/install/spark_cluster_mode.html#spark-standalone-mode">Zeppelin on Spark Cluster Mode (Standalone)</a></li>
<li><a href="{{BASE_PATH}}/install/spark_cluster_mode.html#spark-standalone-mode">Zeppelin on Spark Cluster Mode (Yarn)</a></li>
```
Member

It would probably be a good idea to all-cap YARN:
http://spark.apache.org/docs/latest/running-on-yarn.html

@astroshim
Contributor Author

astroshim commented Aug 11, 2016

@felixcheung Thank you very much for the detailed review. 👍
I'll fix them, but I wonder which Spark and Hadoop versions this document should cover.

@felixcheung
Member

It's hard to say. I think one approach would be the latest (Spark 2.0 & Hadoop 2.7); another approach would be the most popular ones.

@astroshim
Contributor Author

astroshim commented Aug 11, 2016

Do you know which Spark & Hadoop versions are popular? I can test them.

@astroshim
Contributor Author

Spark 2.0 & Hadoop 2.3 are working well.
(screenshot)

@felixcheung
Member

Cool. Hadoop versions in distributions:

  • CDH: 2.6.0
  • HDP/Azure: 2.7.1
  • EMR: 2.7.2
  • GCP Dataproc: 2.7.2

@astroshim
Contributor Author

Then what about supporting the latest versions (Spark 2.0 & Hadoop 2.7)?

@bzz
Member

bzz commented Aug 12, 2016

The docs look great to me, thank you @astroshim!

@astroshim
Contributor Author

Spark 2.0 & Hadoop 2.7 are working well.

  • YARN (screenshot)
  • HDFS (screenshot)

As you can see, the test spark-submit job was successful, but the Zeppelin job doesn't work properly.
I'll take a look at the problem later.

@astroshim
Contributor Author

astroshim commented Aug 16, 2016

I got the following error when I tried to run Zeppelin with Spark 2.0 & Hadoop 2.7.

```
ERROR [2016-08-16 16:43:33,121] ({pool-1-thread-3} Utils.java[invokeMethod]:40) -
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
        at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
        at org.apache.zeppelin.spark.SparkInterpreter.createSparkSession(SparkInterpreter.java:345)
        at org.apache.zeppelin.spark.SparkInterpreter.getSparkSession(SparkInterpreter.java:218)
        at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:743)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getProgress(LazyOpenInterpreter.java:110)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.getProgress(RemoteInterpreterServer.java:447)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:1701)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:1686)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.yarn.conf.YarnConfiguration
        at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.newConfiguration(YarnSparkHadoopUtil.scala:71)
        at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:54)
        at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.<init>(YarnSparkHadoopUtil.scala:56)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at java.lang.Class.newInstance(Class.java:383)
        at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:414)
        at org.apache.spark.deploy.SparkHadoopUtil$.yarn$lzycompute(SparkHadoopUtil.scala:412)
        at org.apache.spark.deploy.SparkHadoopUtil$.yarn(SparkHadoopUtil.scala:412)
        at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:437)
        at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:2203)
        at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:104)
        at org.apache.spark.SparkEnv$.create(SparkEnv.scala:320)
        at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:165)
        at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:259)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:423)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2256)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:831)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:823)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:823)
        ... 20 more
```

My build command is

```
mvn clean package -Pspark-2.0 -Phadoop-2.7 -Dhadoop.version=2.7.2 -Pyarn -Ppyspark -Pscala-2.11 -DskipTests
```

but the Hadoop libraries for the Spark interpreter are

```
~/zeppelin$ ls -al ./spark/target/lib/hadoop-*
-rw-rw-r-- 1 hsshim hsshim   17385  8월 16 23:52 ./spark/target/lib/hadoop-annotations-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim   49750  8월 16 23:52 ./spark/target/lib/hadoop-auth-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim    2559  8월 16 23:52 ./spark/target/lib/hadoop-client-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim 2735584  8월 16 23:52 ./spark/target/lib/hadoop-common-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim 5242252  8월 16 23:52 ./spark/target/lib/hadoop-hdfs-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim  482042  8월 16 23:52 ./spark/target/lib/hadoop-mapreduce-client-app-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim  656365  8월 16 23:52 ./spark/target/lib/hadoop-mapreduce-client-common-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim 1455001  8월 16 23:52 ./spark/target/lib/hadoop-mapreduce-client-core-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim   35216  8월 16 23:52 ./spark/target/lib/hadoop-mapreduce-client-jobclient-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim   21537  8월 16 23:52 ./spark/target/lib/hadoop-mapreduce-client-shuffle-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim 2015575  8월 16 23:52 ./spark/target/lib/hadoop-yarn-api-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim   94728  8월 16 23:52 ./spark/target/lib/hadoop-yarn-client-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim 1301627  8월 16 23:52 ./spark/target/lib/hadoop-yarn-common-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim  175554  8월 16 23:52 ./spark/target/lib/hadoop-yarn-server-common-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim   25710  8월 16 23:52 ./spark/target/lib/hadoop-yarn-server-web-proxy-2.2.0.jar
```

Maybe the error occurs because of mismatched versions of the Hadoop libraries.
I will make a PR for this.
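For reference, a sketch of one way to trace where the mixed Hadoop versions come from; `dependency:tree` is a standard Maven goal, and `-pl spark` is assumed here to select the Spark interpreter module:

```
# List every org.apache.hadoop artifact on the spark module's dependency tree,
# together with the dependency that pulls it in
mvn dependency:tree -Dincludes=org.apache.hadoop -pl spark
```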

@felixcheung
Member

Looks like some of the Hadoop jars are 2.2 instead of 2.7?

@astroshim
Contributor Author

@felixcheung Yes, 2.2. Maybe it's because my Maven repo has different versions of the Hadoop libraries, like the following.

```
~/zeppelin$ ls -al ~/.m2/repository/org/apache/hadoop/hadoop-common/
total 36
drwxrwxr-x  9 hsshim hsshim 4096  8월 16 17:03 .
drwxrwxr-x 25 hsshim hsshim 4096  8월 17 00:22 ..
drwxrwxr-x  2 hsshim hsshim 4096  8월 16 17:04 2.2.0
drwxrwxr-x  2 hsshim hsshim 4096  6월 22 12:04 2.3.0
drwxrwxr-x  2 hsshim hsshim 4096  6월 22 00:07 2.4.0
drwxrwxr-x  2 hsshim hsshim 4096  6월 22 00:14 2.5.1
drwxrwxr-x  2 hsshim hsshim 4096  8월  4 19:54 2.6.0
drwxrwxr-x  2 hsshim hsshim 4096  8월 16 14:50 2.7.0
drwxrwxr-x  2 hsshim hsshim 4096  7월 14 16:58 2.7.2
```

So I made a fix for this in #1335.

@astroshim
Contributor Author

The Zeppelin job succeeded with Spark 2.0 & Hadoop 2.7 after applying #1335.

  • Hadoop libraries:
```
~/zeppelin$ ls -al ./spark/target/lib/hadoop-*
-rw-rw-r-- 1 hsshim hsshim   17385  8월 17 00:51 ./spark/target/lib/hadoop-annotations-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim   70685  8월 17 00:51 ./spark/target/lib/hadoop-auth-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim    2545  8월 17 00:51 ./spark/target/lib/hadoop-client-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim 3443040  8월 17 00:51 ./spark/target/lib/hadoop-common-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim 8268375  8월 17 00:51 ./spark/target/lib/hadoop-hdfs-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim  516614  8월 17 00:51 ./spark/target/lib/hadoop-mapreduce-client-app-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim  753123  8월 17 00:51 ./spark/target/lib/hadoop-mapreduce-client-common-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim 1531485  8월 17 00:51 ./spark/target/lib/hadoop-mapreduce-client-core-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim   38213  8월 17 00:51 ./spark/target/lib/hadoop-mapreduce-client-jobclient-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim   48268  8월 17 00:51 ./spark/target/lib/hadoop-mapreduce-client-shuffle-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim 2015575  8월 17 00:51 ./spark/target/lib/hadoop-yarn-api-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim  142639  8월 17 00:51 ./spark/target/lib/hadoop-yarn-client-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim 1653294  8월 17 00:51 ./spark/target/lib/hadoop-yarn-common-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim  364376  8월 17 00:51 ./spark/target/lib/hadoop-yarn-server-common-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim   34953  8월 17 00:51 ./spark/target/lib/hadoop-yarn-server-web-proxy-2.7.2.jar
```
  • Zeppelin screen (screenshot)
  • YARN applications screen (screenshot)

@zjffdu
Contributor

zjffdu commented Aug 16, 2016

@astroshim I can use Spark 2.0 and Hadoop 2.7 successfully. I hit this issue when building Zeppelin with the yarn profile enabled. So please don't enable the yarn profile; otherwise you will get a Hadoop version mismatch. I have left a comment in #1301 to remove the yarn profile.
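For clarity, a sketch of the resulting build command, i.e. the command posted above with only `-Pyarn` removed:

```
mvn clean package -Pspark-2.0 -Phadoop-2.7 -Dhadoop.version=2.7.2 -Ppyspark -Pscala-2.11 -DskipTests
```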

@astroshim
Contributor Author

@zjffdu I just tested it and it succeeded after removing the yarn profile.
I will close #1335 then.
Thank you.

@astroshim
Contributor Author

Please merge this if there is no more discussion, because I want to write the document for https://issues.apache.org/jira/browse/ZEPPELIN-1279.

@zjffdu
Contributor

zjffdu commented Aug 18, 2016

The JIRA title seems a little confusing to me. The PR is about running Spark on YARN via Docker, but I don't think users will use Docker for production for now.

@astroshim
Contributor Author

@zjffdu You're right; users usually don't build their production environments with Docker.
But we have received many questions about running Zeppelin in production environments, so I think this PR gives users hints for solving those problems and building their production environments too.
Does this make sense?

@zjffdu
Contributor

zjffdu commented Aug 18, 2016

It would be better to change the title to reflect Docker. I think we should mention that Docker is only for small experimental environments rather than production environments. Besides that, I don't know how complicated using Docker is; I would be more conservative about bringing in extra dependencies, especially when they are complicated and not usually needed in real environments. We can hear more feedback from people who know more about Docker.

@astroshim
Contributor Author

I can update the PR title, but this PR is part of https://issues.apache.org/jira/browse/ZEPPELIN-1198, and https://issues.apache.org/jira/browse/ZEPPELIN-1278 is already merged.
If you build this PR, you can see the title in the docs as `Zeppelin on Spark Cluster Mode (YARN via Docker)`.

@zjffdu
Contributor

zjffdu commented Aug 18, 2016

Thanks @astroshim, I have no other concerns.

@astroshim astroshim changed the title [ZEPPELIN-1280][Spark on Yarn] Documents for running zeppelin on production environments. [ZEPPELIN-1280][Spark on Yarn] Documents for running zeppelin on production environments using docker. Aug 18, 2016
@felixcheung
Member

I agree we could be more specific on the title/subject for this document.

But lots of companies run production on Docker, just FYI: either Docker by itself on premises, or in the cloud with something like DC/OS.

@AhyoungRyu
Contributor

AhyoungRyu commented Aug 29, 2016

Can this be merged now? :)

@bzz
Member

bzz commented Aug 29, 2016

Looks great to me.

Merging to master if there is no further discussion.

@asfgit asfgit closed this in eccfe00 Aug 29, 2016
asfgit pushed a commit that referenced this pull request Sep 3, 2016
### What is this PR for?
This PR adds documentation for running Zeppelin in production environments, especially Spark on Mesos via Docker.
The related issues are #1227 and #1318, and I got a lot of hints from https://github.com/sequenceiq/hadoop-docker.
Tested on Ubuntu.

### What type of PR is it?
Documentation

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1279

### How should this be tested?
You can refer to https://github.com/apache/zeppelin/blob/master/docs/README.md#build-documentation.
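A minimal sketch of what that amounts to, assuming the Jekyll setup described in docs/README.md:

```
# Build and serve the documentation site locally (run from the zeppelin repo root)
cd docs
bundle install                      # install Jekyll and the other gems from the Gemfile
bundle exec jekyll serve --watch    # then browse http://localhost:4000
```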

### Questions:
* Do the license files need an update? No.
* Are there breaking changes for older versions? No.
* Does this need documentation? No.

Author: astroshim <[email protected]>
Author: AhyoungRyu <[email protected]>
Author: HyungSung <[email protected]>

Closes #1389 from astroshim/ZEPPELIN-1279 and squashes the following commits:

974366a [HyungSung] Merge pull request #10 from AhyoungRyu/ZEPPELIN-1279-ahyoung
076fdba [AhyoungRyu] Change zeppelin_mesos_conf.png file
1cbe9d3 [astroshim] fix spark version and mesos
2b821b4 [astroshim] fix docs
159bafc [astroshim] fix anchor
d8c43b4 [astroshim] add navigation
c808350 [astroshim] add image file and doc
a3b0ded [astroshim] create dockerfile for mesos